a) clustering: cluster samples by per-sample signals (loss, gradient, etc.), then select representative samples and remove outliers
b) data contribution: measure the contribution of each sample
- The performance difference between training with and without this sample
c) learn the weights of training samples: train with a weighted loss and evaluate on the validation set
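
As a minimal sketch of item b), the with/without performance difference can be computed by leave-one-out retraining. Everything below is an illustrative assumption rather than a method from these notes: the model is a 1-D least-squares line, performance is validation MSE, the toy data is synthetic, and the helper names (fit_linear, val_mse, loo_contributions) are hypothetical.

```python
import numpy as np

def fit_linear(x, y):
    """Least-squares fit of y = w*x + b; returns the array [w, b]."""
    A = np.column_stack([x, np.ones_like(x)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def val_mse(coef, x_val, y_val):
    """Mean squared error of the fitted line on the validation set."""
    pred = coef[0] * x_val + coef[1]
    return float(np.mean((pred - y_val) ** 2))

def loo_contributions(x, y, x_val, y_val):
    """Contribution of sample i = (validation MSE without i) - (validation MSE with all).

    Positive: removing the sample hurts, so the sample helps.
    Negative: removing the sample helps, so the sample is harmful.
    """
    full = val_mse(fit_linear(x, y), x_val, y_val)
    contrib = np.empty(len(x))
    for i in range(len(x)):
        keep = np.arange(len(x)) != i  # boolean mask dropping sample i
        without_i = val_mse(fit_linear(x[keep], y[keep]), x_val, y_val)
        contrib[i] = without_i - full
    return contrib

# Synthetic toy data (assumption): a clean line y = 2x + 1 with one corrupted label.
x_train = np.linspace(0.0, 1.0, 20)
y_train = 2.0 * x_train + 1.0
y_train[7] += 50.0  # inject an outlier at index 7

x_val = np.linspace(0.05, 0.95, 10)
y_val = 2.0 * x_val + 1.0

contrib = loo_contributions(x_train, y_train, x_val, y_val)
worst = int(np.argmin(contrib))  # most harmful sample
print("most harmful sample index:", worst)
```

Retraining once per sample costs one full fit for each training point, so this exact form is only practical for small datasets or cheap models; for larger settings an approximation would be needed.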